empirical fisher
Failure case 2: Augerino depends on the parameterisation of invariance that is used.

The full GGN approximation in Eq. 5 is in O(NP^2 C) for computing the N matrix products. The diagonal GGN approximation would be in O(NPC), and computation of the log-determinant only O(P). Computing the log-determinant can be done efficiently in O(D^3 + G^3) by decomposing the Kronecker factors (Immer et al., 2021a). The last two terms, which depend on S, arise from the aggregation of augmentation samples in our approximation, that is, the expectations over a and g in the second line of Eq. 15.
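To make the O(D^3 + G^3) claim concrete: the eigenvalues of a Kronecker product A ⊗ B are all pairwise products of the factor eigenvalues, so the log-determinant can be computed from two small eigendecompositions without ever forming the full DG x DG matrix. Below is a minimal NumPy sketch under that assumption; the function name and the isotropic prior precision delta (added as in Laplace-approximation settings) are illustrative, not the paper's exact implementation.

```python
import numpy as np

def kron_logdet_with_prior(A, B, delta=1.0):
    """log det(A kron B + delta * I) in O(D^3 + G^3) time.

    A: (D, D) symmetric PSD Kronecker factor.
    B: (G, G) symmetric PSD Kronecker factor.
    delta: illustrative isotropic prior precision.

    The eigenvalues of A kron B are all products lam_i * mu_j of the
    factor eigenvalues, so the (D*G, D*G) matrix is never formed.
    """
    lam = np.linalg.eigvalsh(A)   # O(D^3)
    mu = np.linalg.eigvalsh(B)    # O(G^3)
    return float(np.log(np.outer(lam, mu) + delta).sum())

# Sanity check against the dense O((DG)^3) computation on tiny factors.
rng = np.random.default_rng(0)
D, G = 4, 3
A = rng.standard_normal((D, D)); A = A @ A.T
B = rng.standard_normal((G, G)); B = B @ B.T
dense = np.kron(A, B) + np.eye(D * G)
assert np.isclose(kron_logdet_with_prior(A, B), np.linalg.slogdet(dense)[1])
```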
An Improved Empirical Fisher Approximation for Natural Gradient Descent
Approximate Natural Gradient Descent (NGD) methods are an important family of optimisers for deep learning models, which use approximate Fisher information matrices to pre-condition gradients during training. The empirical Fisher (EF) method approximates the Fisher information matrix empirically by reusing the per-sample gradients collected during back-propagation. Despite its ease of implementation, the EF approximation has theoretical and practical limitations. This paper investigates an underlying issue of EF, which is shown to be a major cause of its poor empirical approximation quality. An improved empirical Fisher (iEF) method is proposed to address this issue, which is motivated as a generalised NGD method from a loss reduction perspective, while retaining the practical convenience of EF.
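For context, the baseline EF approximation referenced in the abstract can be assembled directly from the per-sample gradients that back-propagation already produces: EF = (1/N) * sum_n g_n g_n^T, typically damped before inversion. The NumPy sketch below shows only this standard EF preconditioner, not the paper's proposed iEF variant; function names, the damping constant, and the learning rate are illustrative.

```python
import numpy as np

def empirical_fisher(per_sample_grads, damping=1e-3):
    """Damped empirical Fisher from per-sample gradients.

    per_sample_grads: (N, P) array whose n-th row is the gradient of
        the loss on example n w.r.t. all P parameters.
    Returns (1/N) * sum_n g_n g_n^T + damping * I, the matrix EF
    methods use to precondition the mean gradient.
    """
    n_samples, n_params = per_sample_grads.shape
    ef = per_sample_grads.T @ per_sample_grads / n_samples
    return ef + damping * np.eye(n_params)

def ef_ngd_step(mean_grad, ef, lr=0.1):
    """One EF-preconditioned update direction: -lr * EF^{-1} g."""
    return -lr * np.linalg.solve(ef, mean_grad)
```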
We would like to thank the reviewers for their comments, and take the opportunity to answer their questions below.
- We thank the reviewer for the relevant [Amari et al., 2000] reference, which we will cite and discuss. Similarly, [Amari et al., 2000] considers single-layer networks.
- Further, we examined the method's accuracy relative to recent techniques, and extended it to
- We are open to changing the term "WoodFisher", which we used as a mnemonic.
- Please see Appendix S5 for ablation studies.
- For simplicity, we consider the scaling constant as 1 here.
- Thanks for the suggestions; we will correct the font sizes & the broken references.
On the Computation of the Fisher Information in Continual Learning
Continual learning is a rapidly growing subfield of deep learning devoted to enabling neural networks to incrementally learn new tasks, domains or classes while not forgetting previously learned ones. Such continual learning is crucial for addressing real-world problems where data are constantly changing, such as in healthcare, autonomous driving or robotics. Unfortunately, continual learning is challenging for deep neural networks, mainly due to their tendency to forget previously acquired skills when learning something new. Elastic Weight Consolidation (EWC) [1], developed by Kirkpatrick and colleagues from DeepMind, is one of the most popular methods for continual learning with deep neural networks. To this day, this method is featured as a baseline in a large proportion of continual learning studies. However, in the original paper the exact implementation of EWC was not well described, and no official code was provided. A previous blog post by Huszár [2] already addressed an issue relating to how EWC should behave when there are more than two tasks.
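For readers unfamiliar with the mechanics: EWC stores a (typically diagonal) Fisher estimate F and the parameters theta* at the end of a task, then regularises later training with the penalty (lambda/2) * sum_i F_i * (theta_i - theta*_i)^2. The PyTorch sketch below uses a simple squared batch-gradient estimate of the diagonal Fisher; how exactly that Fisher term should be computed (per-sample vs. batch gradients, observed labels vs. labels sampled from the model) is precisely the implementation question at issue here, so this is one possible variant rather than the canonical one.

```python
import torch

def diagonal_fisher(model, data_loader, loss_fn):
    """Diagonal Fisher estimate at the current parameters.

    Uses squared batch gradients of the training loss; other
    implementations use per-sample gradients or labels sampled
    from the model's own predictive distribution instead.
    """
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}

def ewc_penalty(model, fisher, star_params, lam=1.0):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    penalty = torch.zeros(())
    for n, p in model.named_parameters():
        penalty = penalty + (fisher[n] * (p - star_params[n]) ** 2).sum()
    return 0.5 * lam * penalty
```

When training on a new task, the total objective is then the new task's loss plus ewc_penalty(model, fisher, star_params, lam), with fisher and star_params frozen from the end of the previous task.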